Extraction of Step-Repulsion Strengths from Terrace Width Distributions: Statistical and Analytic Considerations
Recently it has been recognized that the so-called generalized
Wigner distribution may provide at least as good a
description of terrace width distributions (TWDs) on vici- 
nal surfaces as the standard Gaussian fit and is particularly 
applicable for weak repulsions between steps, where the lat- 
ter fails. Subsequent applications to vicinal copper surfaces 
at various temperatures confirmed the serviceability of the 
new analysis procedure but raised some theoretical questions. 
Here we address these issues using analytical, numerical, and 
statistical methods. We propose an extension of the gener- 
alized Wigner distribution to a two-parameter fit that allows 
the terrace widths to be scaled by an optimal effective mean 
width. We discuss quantitatively the approach of a Wigner 
distribution to a Gaussian form for strong repulsions, how er- 
rors in normalization or mean affect the deduced interaction, 
and how optimally to extract the interaction from the vari- 
ance and mean of the TWD. We show that correlations reduce 
by two orders of magnitude the number of independent mea- 
surements in a typical STM image. We also discuss the effect 
of the discreteness ("quantization") of terrace widths, finding
that for high misorientation (small mean width) the standard 
continuum analysis gives faulty estimates of step interactions. 

PACS Number(s): 05.40.+j,61.16.Ch,68.35.Md,68.35.Bs 



I. INTRODUCTION 

During the last decade a number of researchers have 
used atomic-scale microscopy to make quantitative ex- 
perimental measurements of the terrace width distribu- 
tion (TWD) of vicinal surfaces. To understand the data 
— and, especially, to extract the strength of the interac- 
tion between the steps — they have fit the TWDs with 
Gaussians (or in cases of no apparent energetic repulsion,
with free-fermion distributions). Recently there has been 
a significant improvement in the theoretical understand- 
ing of interacting steps on vicinal surfaces: as an example 
of a fluctuation phenomenon, they should be described by 
certain universal features related to random-matrix the- 
ory. In particular, the TWD should be well describable in 
terms of a generalized form of the distribution surmised 
by Wigner to describe some special cases of interactions.

In a recent paper [2], hereafter GE, terrace width
distributions (TWDs) of various vicinal copper surfaces 
were analyzed using both the traditional Gaussian ap- 
proach and the generalized Wigner surmise. Many con- 
clusions were noted in passing about the relative merits 
and sensitivities of these two approaches. The goal of this 
paper is to provide supporting details together with new 
results and approaches that should aid in the interpre- 
tation and analysis of experimental TWDs. We explore 
the relationship between the Wigner form of TWDs and 
the Gaussian. We discuss several statistical considera- 
tions that should be taken into account. The many issues 
treated by this paper arose during the course of analyzing 
experimental data in GE. 

The organization of this paper is as follows. Sec. II
reviews the TWD derived from the generalized Wigner 
surmise and presents some practical new approximations 
derived from series expansions. In particular, we provide 
what we believe is the best simple expression [Eq. (7)]
to deduce the step-step repulsion strength from the variance
of the TWD. Sec. III deals with the approach of
the generalized Wigner distribution to the form of the
Gaussian for strong step-step repulsions. While this be- 
havior had been recognized earlier, we now characterize 
it quantitatively. In Sec. IV we contend with a recurring
theme in GE: the error generated by uncertainty
in the mean of the distribution. Experimentalists had 
the belief that Gaussian fits of data are more forgiving 
of such errors than are Wigner fits. We study this no- 
tion quantitatively by checking for both distributions the 
effect of perturbations in normalization and in mean by 
fitting deliberately misnormed or displaced data. The 
results of arguably greatest interest to experimentalists
are in Sec. V. We describe an extension of our proposed
analysis scheme for TWDs for which the first moment of 
the data does not conform well to the apparent mean. 
We propose treating the generalized Wigner distribution 
as a two-parameter function: in addition to the exponent 
g, the value of the effective mean (which scales the terrace
widths; cf. Sec. II) is adjusted simultaneously in the
non-linear least-squares fit. This procedure makes little 
difference for the "good" data reported in GE, but can 
have significant effect on "poorer" data glossed over in 
that paper. We present both graphical illustrations and 
thorough tabulations for the extensive data for vicinal 
copper discussed in GE. We also apply the Wigner distribution
to recently published data for vicinal Pt(110).
Sections VI and VII offer a pair of warnings regarding
how the discreteness of the terrace widths and the lim- 
ited size of the sample, respectively, can confound the 
analysis. In the former case, for the range of interac- 
tion strengths found in physical systems, discreteness 
becomes problematic for high misorientations, when the 
mean terrace width drops to just a few lattice spacings. 
In the latter case, we observe that statistical fluctuations 
due to the typical size might well account for some of 
the data sets labeled as "poor," rather than some system 
contaminant or measurement flaw. A conclusion summa- 
rizes the current state of our understanding. 



II. GENERALIZED WIGNER SURMISE: RECAP 
OF KEY FORMULAS AND NEW RESULTS 
FROM SERIES EXPANSION 



As has been discussed extensively before, a new
idea from random-matrix theory is that fluctuations
should exhibit certain universal behavior. According to 
the so-called Wigner surmise, the distribution of fluctu- 
ations can be approximated by 



$$P_g(s) = a_g\, s^{g} \exp\left(-b_g s^{2}\right), \tag{1}$$



where s = ℓ/⟨ℓ⟩, ℓ being the terrace width, and the constants
b_g and a_g are fixed by the requirements of unit mean and of
normalization, respectively. For brevity, we refer hereafter to
this set of formulas as the CGWD (continuum generalized Wigner
distribution). The CGWD can be derived in a more
transparent fashion from a mean-field approximation.
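
As a minimal numerical sketch (not the authors' code), the constants in Eq. (1) can be evaluated from the two conditions of unit mean and unit area. The closed forms used below, b_g = [Γ((g+2)/2)/Γ((g+1)/2)]² and a_g = 2 b_g^((g+1)/2)/Γ((g+1)/2), follow from those two conditions applied to the form s^g exp(−b_g s²); the function names and the midpoint-rule check are illustrative assumptions.

```python
import math

def cgwd_constants(g):
    """Return (a_g, b_g) for P_g(s) = a_g * s**g * exp(-b_g * s**2)."""
    # Unit mean fixes b_g; unit area then fixes a_g.
    b = (math.gamma((g + 2) / 2) / math.gamma((g + 1) / 2)) ** 2
    a = 2 * b ** ((g + 1) / 2) / math.gamma((g + 1) / 2)
    return a, b

def cgwd(s, g):
    """Continuum generalized Wigner distribution evaluated at s."""
    a, b = cgwd_constants(g)
    return a * s ** g * math.exp(-b * s * s)

def moment(g, k, s_max=8.0, n=8000):
    """k-th moment about the origin by the midpoint rule (illustrative accuracy)."""
    h = s_max / n
    return sum(((i + 0.5) * h) ** k * cgwd((i + 0.5) * h, g) for i in range(n)) * h

# Area and mean should both come out close to 1 for any g.
for g in (1, 2, 4):
    print(g, round(moment(g, 0), 6), round(moment(g, 1), 6))
```

For g = 2 the gamma functions reduce to b_g = 4/π, which gives a quick sanity check on the implementation.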

The approximate result in Eq. (2), derived in Appendix
A by asymptotic expansion, is new. It is consistent with 
Eq. (9) of GE in the neighborhood of g^i; it is within 
0.2% of the exact b_g as calculated using gamma functions
at g = 2 and is within 0.05% of b_g by g ≈ 4.

Experimentally, a TWD is typically characterized by
its variance σ². In principle σ² might be determined directly
from the second moment of the TWD, but there is
concern that this approach does not adequately minimize
noise in the data, an issue we shall revisit in Sec. VII.



Thus, in practice, TWDs are fit to smooth functions; 
Gaussians are typically chosen, not just for their sim- 
plicity but because their use can be justified readily in 
the limit of strong elastic repulsion between steps. The 
variance of the TWD is then approximated by the variance
σ_G² of the fitted Gaussian. We argue here and in GE
that the CGWD given in Eq. (1) is scarcely more compli- 
cated than a Gaussian but provides a better accounting 
of the variance. For strong step repulsions, the variance 
of the fitted Gaussian is usually not very different from 
the variance σ² of a CGWD, as is discussed more quantitatively
in Sec. III. For weak repulsions, however, it is
well known that the TWD becomes too skewed to allow
a satisfactory fit to a Gaussian. Experimentalists finding
themselves in this predicament have been stymied on how
to proceed quantitatively. Significantly, a Gaussian
fit to a TWD with nonnegligible skewness cannot even
be expected to have the correct mean; the consequences 
of this fact are dealt with in much of the remainder of 
this paper. 

For the CGWD, the variance can be expressed simply
in terms of b_g [Eq. (4a)]. Using Eq. (2), we obtain the
approximation of Eq. (4b), which is accurate
for large values of g (e.g. σ² is overestimated by about
0.5% at g = 4 but just 0.1% at g = 10).

The usual goal in an experiment is to extract the magnitude
A of the elastic repulsion between steps, perpendicular
to the step direction, given by A/ℓ². All standard
analysis procedures make a continuum approximation in
the direction along the steps (perpendicular to the "upstairs"
direction); thereafter, A appears only in the form
of a dimensionless interaction strength Ã ≡ Aβ̃/(k_BT)²,
where β̃ is the step stiffness. In this conceptualization g
is related to Ã by the equation



$$\tilde{A}_W = \frac{g\,(g-2)}{4}, \tag{5}$$



which follows from mapping this problem onto the
Sutherland Hamiltonian. The subscript W provides a
reminder that this estimate of Ã is based on the CGWD.



Eq. (4b) can be solved for g, which in turn can be inserted
into Eq. (5) to provide a good estimate for Ã_W. However,
a much better estimate of Ã_W (visually indistinguishable
from the exact value on a standard-resolution
graph) comes from performing a reversion of series of a
higher-order version of Eq. (4b) to yield g as a function
of σ² [Eq. (6)] and then inserting this result into Eq. (5):

$$\tilde{A}_W \approx \frac{1}{16}\left[(\sigma^{2})^{-2} - 7\,(\sigma^{2})^{-1} + \frac{27}{4} + \frac{35}{6}\,\sigma^{2}\right]. \tag{7}$$



Eq. (7) should prove quite useful in analyzing data, since
it provides an excellent value for Ã as a function of
the variance of the TWD, assuming the validity of the
CGWD description. We caution that all four terms must
be kept in order to obtain a good estimate of Ã from
Eq. (7). We also warn that, as discussed in Sec. VI, the
effects of discreteness may lead to inconsistencies with
this estimate for highly misoriented vicinal surfaces.
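
The route from a measured variance to the dimensionless repulsion can be sketched numerically. The snippet below is an illustration under stated assumptions, not the authors' procedure: it takes the exact CGWD variance σ² = (g+1)/(2b_g) − 1 (which follows from the moments of s^g exp(−b_g s²) with unit mean), the relation Ã = g(g−2)/4 of Eq. (5), and the four-term expression of Eq. (7) with coefficients 1, −7, 27/4 and 35/6 as read here. All function names are hypothetical.

```python
import math

def exact_variance(g):
    """Variance of the unit-mean CGWD: second moment about the origin minus 1."""
    b = (math.gamma((g + 2) / 2) / math.gamma((g + 1) / 2)) ** 2
    return (g + 1) / (2 * b) - 1

def a_tilde_exact(g):
    """Dimensionless repulsion from the exponent g, Eq. (5)."""
    return g * (g - 2) / 4

def a_tilde_from_variance(var):
    """Four-term series estimate of the repulsion from the variance, Eq. (7) as read here."""
    return (var ** -2 - 7 / var + 27 / 4 + 35 / 6 * var) / 16

# Compare the series estimate with the exact value for a few exponents.
for g in (2, 4, 10):
    var = exact_variance(g)
    print(g, round(var, 5), a_tilde_exact(g), round(a_tilde_from_variance(var), 4))
```

In use, one would replace `exact_variance(g)` with the variance measured from a normalized TWD and read off the estimate directly, keeping in mind the caveats about discreteness raised in the text.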



III. GAUSSIAN FITS OF THE GENERALIZED 
WIGNER DISTRIBUTION 

A characteristic feature of the CGWD is that as g be- 
comes larger, the curve can be better approximated by 
a Gaussian. This feature should be expected, since it is 
accepted that TWDs for strong repulsions are well described
by Gaussians. We quantitatively assessed the degree
of agreement. One measure is the χ² of a fit of the
CGWD to a Gaussian form (with peak position, prefactor,
and standard deviation as the three adjustable
parameters). We find that this measure of the fit
improves exponentially with increasing g. (Specifically,
0.012144 exp(−0.5249 g) provides a close upper bound of
χ² for g > 1.) A second and more useful measure is the
relative difference of the standard deviation σ_G of the fitted
Gaussian from the actual standard deviation σ_W of
the CGWD, given by the square root of the second moment
of the CGWD about its mean of unity. Using Eq.
(4a) we find that this relative difference is well described
by the formula
[relative difference of σ_G from σ_W, expressed through μ′₂ and given as terms in inverse powers of g with coefficients 0.0568 and 0.0138]   (8)



where the expression for the second moment of the TWD
with respect to the origin, μ′₂, is given explicitly as Eq.
(11) of GE or Eq. (8) of EP. Thus, at the calibration
point for repulsive interactions (g = 4, for which an exact
solution exists) the agreement is around 1%, and improves
monotonically with increasing g. For this range
(g > 4) differences between estimates of Ã obtained from
CGWD and the various Gaussian fit methods are predominantly
due to different philosophies of extracting Ã
from σ rather than from differences in the fitting methods.

As discussed at length in EP and GE, there are several
distinct theories for extracting the dimensionless interaction
strength Ã from σ_G. Monte Carlo calculations
indicate that the CGWD provides an excellent estimate
of Ã over the range of physical values of this repulsion,
as well as for stronger values. Thus, as remarked at the
end of the previous section, it is the wisest strategy to
use Eq. (7) to estimate Ã from σ deduced from the TWD
rather than to use the predictions of one of the Gaussian
approximations discussed in Table 1 of GE.



IV. EFFECTS OF PERTURBED 
NORMALIZATION OR MEAN 

The CGWD is a normalized TWD with unit mean. 
In GE, the mean was determined straightforwardly from 
the first moment. The independent variable (the terrace 
width) was then scaled by this value, and the distribution 
normalized. In the course of analyzing TWDs, it became 
obvious that the normalization of the data sets by to- 
tal area (that is, the zeroth moment) and first moment 
provides qualitative agreement with the CGWD — that 
is, the "best fit" CGWD produces a skew distribution 
that roughly matches the TWD — but it does not match 
closely enough to reproduce the correct peak position. In 
order to motivate the more satisfactory treatment of experimental
TWDs in Sec. V, in this section we discuss
the effects of perturbations of the mean step separation
and of the normalization on the variables important for
extracting interaction strengths (σ for a Gaussian fit and
g for a Wigner fit). Such perturbations might arise in experimental
data either due to statistical fluctuations or
due to physical causes, such as perturbations of the step-step
interaction potential A/ℓ² or an incomplete equilibration
of the vicinal surface.

To this end, we created an ideal data set by sampling 
the appropriate distributions at regular intervals. This 
ideal set was then perturbed by various factors not ex- 
ceeding 15%, either by shifting the mean or by scaling 
each point to increase the area under the curve. These 
perturbed sets were then fit as in GE, by normalized fitting
functions with unit mean. Since the true value of g
or σ for our ideal data set is known, it is simple to determine
the error due to the perturbations. In some cases,
the errors behave in complicated ways. 
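
The normalization test described above can be sketched as follows. This is a hedged illustration, not the authors' code: the sampling grid, the golden-section search, and the function names are assumptions, and the gamma-function constants are those implied by unit mean and unit area for the form in Eq. (1). An ideal CGWD with g = 4 is sampled at regular intervals, its area is inflated by a chosen fraction, and a normalized unit-mean CGWD is then fit with g as the only free parameter.

```python
import math

def cgwd(s, g):
    """Normalized, unit-mean CGWD a_g * s**g * exp(-b_g * s**2)."""
    b = (math.gamma((g + 2) / 2) / math.gamma((g + 1) / 2)) ** 2
    a = 2 * b ** ((g + 1) / 2) / math.gamma((g + 1) / 2)
    return a * s ** g * math.exp(-b * s * s)

def fit_g(points, lo=1.5, hi=9.0, tol=1e-7):
    """One-parameter, equally weighted least-squares fit of g (golden-section search)."""
    def sse(g):
        return sum((y - cgwd(s, g)) ** 2 for s, y in points)
    r = (math.sqrt(5) - 1) / 2
    c, d = hi - r * (hi - lo), lo + r * (hi - lo)
    while hi - lo > tol:
        if sse(c) < sse(d):
            hi, d = d, c                  # minimum lies in [lo, d]
            c = hi - r * (hi - lo)
        else:
            lo, c = c, d                  # minimum lies in [c, hi]
            d = lo + r * (hi - lo)
    return (lo + hi) / 2

G_TRUE = 4.0
GRID = [0.05 * k for k in range(1, 81)]   # regular sampling, 0.05 <= s <= 4 (assumed)

def fitted_g(area_excess):
    """Fit g to an ideal g = 4 data set whose area is too large by area_excess."""
    data = [(s, (1 + area_excess) * cgwd(s, G_TRUE)) for s in GRID]
    return fit_g(data)

for eps in (0.0, 0.05, 0.10):
    g_fit = fitted_g(eps)
    print(eps, round(g_fit, 3), round((g_fit - G_TRUE) / G_TRUE, 3))
```

The printed fractional change in g can be compared with the linear coefficient reported for the CGWD in this section; the exact number will depend on the sampling range chosen here.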

In the equations, Δσ is the fitted value of σ minus
the known value of σ (and similarly for Δg); Δμ₀ (or
Δμ₁) indicates how much the area under (or the first
moment of) the constructed curve exceeds the "proper
value." (Moments about the origin are defined in Eq.
(10) of GE. Here for convenience, since we are interested
only in differences, we neglect the primes.) The effects
of errors in normalization can be described rather simply.
The fitted [normalized] curve becomes narrower as
the area under the raw curve increases. For a Gaussian,
the fractional change in σ is approximately linear in the
fractional error of the integrated TWD, with a prefactor
about 2/3:

$$\left.\frac{\Delta\sigma}{\sigma}\right|_{\sigma=0.30} \approx -0.68\,\Delta\mu_0 + 0.81\,(\Delta\mu_0)^{2}. \tag{9}$$



The coefficients in Eq. (9) are insensitive to the value
of σ: if the standard deviation of the raw curve is reduced
from 0.30 to 0.20, the linear coefficient is unchanged,
while the quadratic coefficient is reduced slightly to 0.80.
For the CGWD, the fit is even more nearly linear:

$$\left.\frac{\Delta g}{g}\right|_{g=4.0} \approx 1.38\,\Delta\mu_0. \tag{10}$$



Again increased area leads to an effectively sharper
distribution. The linear coefficient is nearly double that
in Eq. (9), as one might expect from Eq. (13) of GE.
This coefficient again is insensitive to the value of g of
the raw distribution: for g = 7.0 it dips slightly to 1.37.

Errors in the mean of the distribution create errors in
the fit that are not so easy to describe. The changes in
the fitted parameters are quadratic rather than linear in
Δμ₁, and the coefficients depend strongly on the value of
σ or g of the raw distribution.

For Gaussians, we find that the following expression
provides a good approximation for standard deviations
between 0.2 and 0.4 (corresponding to 1.5 < g < 9):

$$\frac{\Delta\sigma}{\sigma} \approx \frac{1}{2}\left(\frac{\Delta\mu_1}{\sigma}\right)^{2}. \tag{11}$$

Appendix B provides an analytic derivation of this approximation
as the leading-order term in an expansion
of the appropriate Gaussian integral. Eq. (11) can also
be generated from straightforward fitting of numerical
data.*

Thus, as might be expected since the Gaussian is symmetric
about its peak, the error is insensitive to the sign
of the error in the mean of the raw distribution. The
fitted distribution is broader than the raw one, with the
fractional error of the fitted σ dependent roughly on the
"fractional error" (with respect to σ) of the first moment,
i.e. increasing as the distribution becomes sharper.
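
The quadratic dependence on the displacement for Gaussians can be checked with a small numerical experiment. The sketch below is illustrative only (grid, search bounds, and function names are assumptions): a Gaussian of width σ₀ displaced by Δμ₁ from unit mean is fit by a normalized Gaussian whose mean is held at 1, and the fractional broadening of the fitted width is compared with (1/2)(Δμ₁/σ₀)², the form of Eq. (11) as read here.

```python
import math

def gauss(s, mean, sigma):
    """Normalized Gaussian."""
    return math.exp(-0.5 * ((s - mean) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def fit_sigma(points, lo, hi, tol=1e-8):
    """Equally weighted least-squares fit of sigma, with the mean fixed at 1."""
    def sse(sig):
        return sum((y - gauss(s, 1.0, sig)) ** 2 for s, y in points)
    r = (math.sqrt(5) - 1) / 2
    c, d = hi - r * (hi - lo), lo + r * (hi - lo)
    while hi - lo > tol:
        if sse(c) < sse(d):
            hi, d = d, c
            c = hi - r * (hi - lo)
        else:
            lo, c = c, d
            d = lo + r * (hi - lo)
    return (lo + hi) / 2

def fractional_broadening(sigma0, shift):
    """(sigma_fit - sigma0) / sigma0 for data displaced by `shift` from unit mean."""
    grid = [0.01 * k for k in range(0, 301)]            # 0 <= s <= 3 (assumed sampling)
    data = [(s, gauss(s, 1.0 + shift, sigma0)) for s in grid]
    return fit_sigma(data, 0.5 * sigma0, 2.0 * sigma0) / sigma0 - 1.0

for shift in (-0.03, 0.03, 0.06):
    print(shift, round(fractional_broadening(0.30, shift), 5),
          round(0.5 * (shift / 0.30) ** 2, 5))
```

For σ₀ = 0.30 and a shift of ±0.03, the quadratic form gives (1/2)(0.1)² = 0.005 for either sign, which is the insensitivity to sign noted in the text.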

Since the Wigner distribution is not symmetric about
its peak, the corresponding error in fitting an off-center
CGWD by a properly centered CGWD should not
depend purely quadratically on Δμ₁. Indeed, we find
over the range 1 < g < 8 that an excellent approximation
is

$$\frac{\Delta g}{g} \approx (0.3\,g - 3.0)\,\Delta\mu_1 + (-2.0\,g + 0.4)\,(\Delta\mu_1)^{2}. \tag{12}$$

Analogous to the previous result for σ, the fractional
error of g has strong quadratic tendencies, with the magnitude
of the curvature increasing with increasing g. The
linear term complicates behavior, causing g to increase
for small shifts of the curve to the right. Evidently for
some g-dependent offset, the best fit will coincidentally
give the true value of g.

* In the process, one can generate the result Δσ/σ ≈
0.486 σ^(…) (Δμ₁)², which is numerically superior to Eq. (11)
but does not satisfy proper dimensional behavior.



V. WIGNER DISTRIBUTION AS A 
2-PARAMETER FIT 

In fitting experimental TWDs, it becomes apparent 
that in many cases — particularly when the data are rel- 
atively poor — the CGWDs giving the best fits have first 
moments different from the first moments of the data. 
GE noted that the peak of TWDs can be well fitted by 
treating a Gaussian as a 3-parameter fitting function, 
with the peak position and the prefactor allowed to vary 
along with the standard deviation. (Presumably the pref- 
actor differs from its expected value, set by normalizing 
the Gaussian, because of the existence of a small "hump" 
sometimes observed at large values of s [see below].) In 
contrast, it is not clear how such arbitrary modifications 
could be made to the CGWD, nor is it clear what phys- 
ical information could be extracted from a CGWD with 
arbitrary modifications. 

From a basic perspective, though, it might be desirable 
to determine the scaling length (the "effective mean," 
which equals the first moment for ideal CGWDs) and 
the variance in a single fitting procedure rather than to 
find this length first from the first moment or otherwise. 
For the following discussion, we denote by ℓ̄ the effective
mean determined as one parameter of a two-parameter
least-squares fit of the data to a CGWD, the other parameter
being the exponent g. This refined scaling implies
that the argument of P_g should be ℓ/ℓ̄. It is convenient
to introduce a new adjustable parameter α which
gives the ratio of ℓ̄ to the actual mean step separation ⟨ℓ⟩.
Since s, still defined as ℓ/⟨ℓ⟩, is the natural variable
to use in describing data, our refined scaling translates
into replacing s by s/α in the argument of the distribution.
If the integration variable s were also replaced by
s/α, then the refined scaling would amount to a redefinition
of a dummy variable, and normalization would still
be realized. Since the independent variable is kept as s,
the extra factor is associated with P(s) instead:

$$P_{g,\alpha}(s) = \frac{P_g(s/\alpha)}{\alpha}. \tag{13}$$

In other words, the first moment of the distribution, μ₁ =
⟨ℓ⟩, occurs at 1/α times the optimal characteristic terrace
width ℓ̄.
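
A short sketch may help fix the meaning of the scaling parameter. Assuming the two-parameter form P_g(s/α)/α (Eq. (13) as read here) and the gamma-function constants implied by unit mean and unit area, the snippet checks numerically that the scaled distribution stays normalized while its first moment moves to α. Function names and the midpoint-rule integration are illustrative assumptions.

```python
import math

def cgwd(s, g):
    """Normalized, unit-mean CGWD."""
    b = (math.gamma((g + 2) / 2) / math.gamma((g + 1) / 2)) ** 2
    a = 2 * b ** ((g + 1) / 2) / math.gamma((g + 1) / 2)
    return a * s ** g * math.exp(-b * s * s)

def cgwd_scaled(s, g, alpha):
    """Two-parameter form: terrace widths rescaled by alpha, area kept at 1."""
    return cgwd(s / alpha, g) / alpha

def moment(k, g, alpha, s_max=10.0, n=10000):
    """k-th moment about the origin of the scaled form (midpoint rule)."""
    h = s_max / n
    return sum(((i + 0.5) * h) ** k * cgwd_scaled((i + 0.5) * h, g, alpha)
               for i in range(n)) * h

g, alpha = 6.5, 0.965
print(round(moment(0, g, alpha), 5), round(moment(1, g, alpha), 5))
# Percentage by which the data's first moment exceeds the fitted effective mean:
print(round((1 / alpha - 1) * 100, 1))
```

A scaling factor below 1 therefore means the measured mean ⟨ℓ⟩ exceeds the fitted effective mean by the factor 1/α.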

We used Mathematica® regression routines to fit the
experimental data by minimizing the value of χ² as a
function of the adjustable parameters. Since the values
of s are quantized (cf. Sec. VI), there was assumed to be
no error in these values. For simplicity, all data points
were weighted equally.



A. Copper: Moderately Strong Repulsions 

Our findings for vicinal Cu surfaces are presented in 
Table 1, which is similar to Table 2 of GE, but contains 
many cases of "poor" data omitted in GE. In order to fa- 
cilitate discussion, TWDs were divided by GE into three 
groups based on a visual assessment of their quality: 

• A "good" TWD changes height essentially mono- 
tonically below the peak and again above it; there 
are no dips, humps, or double peaks, and there is 
minimal scatter in the data points. "Good" data 
are indicated by a "+" in Table 1. 

• An "OK" TWD has more scatter, with small dips 
and peaks introduced by variations (within the lim- 
its of the general margin of error) of single data 
points. "OK" data are indicated by a "0" in Table 
1. 

• A "poor" TWD has a double-peak or hump at large 
s; correspondingly, the position of the (main) peak 
occurs noticeably below s = 1, even when the peak 
is fairly narrow and the skewness minimal. The 
judgment that this data is "poor" is based both on 
the intuition of the experimenter and on the fol- 
lowing argument: A second peak at large s would 
be characteristic of the onset of faceting; however, 
"poor" data tends to occur at high temperatures, 
whereas faceting should be more important at low 
temperatures. "Poor" data are indicated by a "— " 
in Table 1. 

As expected, the Gaussian distribution yields a reasonable,
but not exceptional, fit to the data; it worked
especially well on surfaces with low temperatures, and hence
relatively large Ã. As an example of good data, exemplified
by the vicinal (1 1 13) surface at 300 K depicted in
Fig. 1, the [three-parameter] Gaussian yields a χ² value
of about 0.0072. The single-parameter CGWD fit gives a
slightly worse fit to the data, having a χ² of 0.0078. For
the two-parameter Wigner fit [Eq. (13)], the χ² value
improves by better than a factor of two, to 0.0037, with
a value of g increased slightly (from 6.4 to 6.5), leading to
a value of σ closer to that from the Gaussian fit. In this
case, the optimal fit using Eq. (13) is obtained by scaling
the terrace widths with a value that is 96.5% of that
given by the first moment of the distribution. In other
words, the first moment of the TWD is 3.6% greater than
the value of the mean spacing associated with the best
fit of the distribution.

In Fig. 2, we display results for this same vicinal Cu
surface at 378 K as an example of poor data, with a large
shift in the effective mean. In this case, having extra degrees
of freedom in the fit makes a sizable difference. For
the three-parameter Gaussian fit, the χ² is 0.035; χ²
increases to 0.042 for the single-parameter Wigner fit but
falls to about half that value, 0.025, for the two-parameter
fit, all these values being half an order of magnitude
larger than in the previous, good case. The value of g
increases noticeably, from 2.5 to 3.0, when the refined
scaling is allowed (and rises to 4.3 for the shifted-mean
method). The refined scaling factor for terrace widths is
0.867, meaning that the explicit average ⟨ℓ⟩ of the TWD
is 15.3% greater than the value of the mean spacing associated
with the best fit of the distribution. Characteristic
of this sort of data is the hump on the high-s side of the
peak, which distorts the single-parameter CGWD fit so
that it poorly reproduces the peak region.

We emphasize the following general trends in Table 1:
In almost all instances, the value of ℓ̄ derived from the
two-parameter fit to a CGWD is smaller than μ₁ = ⟨ℓ⟩
given by the first moment (the average) of the TWD; likewise,
the directly measured values of σ are almost always
larger than the values obtained by any of the three fitted
curves. (Cf. Sec. VII.) The value of g is higher for the
scaled fit than for the single-parameter CGWD fit, and
the associated value of σ is typically closer to that deduced
from the Gaussian fit. For "good" data, the change in the
effective mean is of order a few percent, and the change in g
and σ is negligible. For "poor" data, the refined scaling
factor is at least twice as large and the two-parameter-fit
curve is narrower than the single-parameter-fit curve.
The tails or humps in the experimental TWDs seem to
be responsible for the systematic discrepancies in the fits,
especially the smaller mean and smaller variance of the
fits relative to the direct measurements.



B. Platinum: Weak Repulsions 

We have also considered recently reported data for vicinal
Pt(110) at room temperature. In this system the
terraces are (1 × 2) reconstructed, and the steps correspond
to 3-unit segments [as would be found in a (1 × 3)
reconstruction]. The authors in that paper conclude that
the interaction between their steps is small, but are unable
to proceed to a quantitative assessment using preexisting
methods: Gaussian methods are utterly inappropriate
for this regime of small interactions.

In Fig. 3, we show single- and two-parameter Wigner
fits of the data. For the former, g = 2.06 (Ã = 0.0309),
with a χ² of 0.008. With the latter, the optimal ℓ̄ for
determining s is 91% of the ⟨ℓ⟩ predicted by the average of
the data (viz. α = 0.91); g rises to 2.24 (Ã = 0.134) and
the quality of the fit improves to χ² = 0.003.
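
As a worked check on the numbers quoted for the platinum fits, the relation Ã = g(g−2)/4 of Eq. (5) can be evaluated directly. The helper name below is hypothetical.

```python
def a_tilde_from_g(g):
    """Dimensionless repulsion from the Wigner exponent, using g*(g-2)/4."""
    return g * (g - 2) / 4

# Fitted exponents quoted for the single- and two-parameter Wigner fits.
for g in (2.06, 2.24):
    print(g, round(a_tilde_from_g(g), 4))
```

Both values are small compared with the calibration point g = 4, where the same relation gives Ã = 2, and g = 2 corresponds to Ã = 0.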

Thus, the high-s bulge does not seem to be peculiar to 
the vicinal Cu systems of GE. We do not understand the 
physical origin of the systematic need for refined scaling 
of experimental data. We see no comparable effect in our 
companion Monte Carlo simulations, reported elsewhere.



VI. EFFECTS OF DISCRETENESS ON 
CONTINUUM MODELS OF A TWD 

Due to the crystalline nature of the surface, the TWD
is a discrete rather than a continuous function: the TWD
should have a sizable number of counts only at values of
ℓ that are a_⊥ times the sum of an integer and a constant
fractional offset determined by the terrace and the
orientation of the steps. (E.g., this offset is 1/2 for close-packed
steps on {1 0 0} surfaces of fcc crystals.) For
simplicity we neglect this offset in this paper, setting it
to zero (as on {1 0 0} surfaces of sc crystals). Thus, s
can only take on the values s_L = L a_⊥/⟨ℓ⟩ = L/⟨L⟩, L
being a positive integer. It is very tempting simply to apply
formulae derived for the continuous TWD given by
Eq. (1). In this section we discuss the potential difficulties
posed by the discrete nature of the TWD. Inspired by
the scaling of discrete TWDs, we construct a discrete
generalized Wigner distribution (DGWD) given by

$$\hat{P}_g(s) = \tilde{a}_g\, s^{g} \exp\left(-b_g s^{2}\right) \sum_{L} \delta(s - s_L), \tag{14}$$

where ã_g ≈ a_g/⟨L⟩ is determined by the requirement of
normalization.

Although b_g was defined so as to make the mean of
the CGWD unity, there is no guarantee that the same
parameter will make the mean of the DGWD unity; likewise,
the two functions may have different variances. We
chose values of ⟨L⟩ and g to specify a DGWD and then
numerically performed two-parameter fits using CGWD
formulae [Eqs. (1)-(3), (13)] to produce estimates of g_c.
Anticipating greater interest in behavior as a function of
Ã than of g, we converted our results for g_c to Ã_c using
Eq. (5).
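
The effect of discreteness on the moments can be illustrated without any fitting. The sketch below is a simplified stand-in for the two-parameter fits described above, not a reproduction of them: it builds a discrete distribution with weights proportional to s^g exp(−b_g s²) at s = L/⟨L⟩ for positive integers L (zero offset, as assumed in the text), and compares its mean and variance with the continuum values. Function names and the cutoff on L are assumptions.

```python
import math

def b_const(g):
    """Constant that gives the continuum distribution unit mean."""
    return (math.gamma((g + 2) / 2) / math.gamma((g + 1) / 2)) ** 2

def cgwd_variance(g):
    """Variance of the continuum distribution (mean 1)."""
    return (g + 1) / (2 * b_const(g)) - 1

def dgwd_moments(mean_L, g, cutoff=8):
    """Mean and variance of the discrete distribution on s = L / mean_L."""
    b = b_const(g)
    s_vals = [L / mean_L for L in range(1, cutoff * mean_L + 1)]
    weights = [s ** g * math.exp(-b * s * s) for s in s_vals]
    norm = sum(weights)                      # normalization fixes the prefactor
    mean = sum(s * w for s, w in zip(s_vals, weights)) / norm
    second = sum(s * s * w for s, w in zip(s_vals, weights)) / norm
    return mean, second - mean ** 2

for mean_L, g in ((10, 4.0), (4, 4.0), (2, 8.0)):
    mean, var = dgwd_moments(mean_L, g)
    print(mean_L, g, round(mean, 4), round(var, 4), round(cgwd_variance(g), 4))
```

For wide terraces the discrete and continuum moments agree closely; for a mean width of two lattice spacings and a narrow distribution they do not, which is the regime the text warns about.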

Fig. 4 shows the difference of the fitted value Ã_c and
the "parent" value Ã as a function of this Ã for several
mean widths ⟨L⟩. As may be expected, as the TWD
becomes narrower (i.e. for sufficiently large Ã or g), Ã_c
becomes a decidedly unreliable estimate for Ã; based
on examination of the cases ⟨ℓ⟩/a_⊥ = 2-6, this breakdown
appears to occur for g near ⟨L⟩². This threshold
corresponds to s_{L+1} − s_L = a_⊥/⟨ℓ⟩ = ⟨L⟩⁻¹ being comparable to σ. Thus,
for ⟨L⟩ < 4, this breakdown occurs in the region of physical
interest (cf. dashed curves in Fig. 4). On the other
hand, for ⟨L⟩ > 4, Ã_c provides a reasonable estimate of
Ã over the range of physically reasonable dimensionless
repulsions, where the effects of discreteness are most pronounced
for small values of ⟨L⟩ or of Ã. Note also that in
each case there is a substantial peak in |Ã_c − Ã| for small
Ã. Fig. 5 shows the reduction in the error in Ã_c as ⟨L⟩
increases, at fixed values of Ã.

In summary, we have raised a flag of caution when
analyzing the fluctuations of highly misoriented vicinal
surfaces in a conventional framework. The case of ⟨L⟩ =
3 corresponds to (1 1 7) for close-packed steps on surfaces
vicinal to {1 0 0} planes of fcc crystals. Thus, one should
view with some suspicion the unusually large values of g
and Ã reported for the single temperature at which this
vicinal Cu surface was measured. For {1 1 1} surfaces,
the corresponding Miller indices are (5 3 3) for A steps
({1 0 0} microfacets) and (2 2 1) for B steps ({1 1 1}
microfacets).

We also emphasize that this behavior is not a vagary
of Wigner distributions. Misorientation causes similar
problems when the mean and variance of discretized
Gaussian TWDs are analyzed as though they were continuous
Gaussian functions. For more convenient comparison
with the above Wigner distribution, we used
Eq. (4) to relate the variances and values of g. We
found that estimates of g based on the variance of the
discretized Gaussians approached the undiscretized value
monotonically, rather than oscillating as in the case of the
Wigner distribution, and that the approach to the undiscretized
value of g is actually somewhat slower in the
Gaussian case than in the Wigner case. The Gaussian
case also showed a breakdown at large values of g (small
σ²) similar to the Wigner case.



VII. STATISTICAL UNCERTAINTIES DUE TO 
FINITE SAMPLING SIZE 

By truncating Eq. (6) at the second term, we can create
an estimator ĝ for g:

$$\hat{g} = \frac{1}{2\hat{\sigma}^{2}} - \frac{3}{4}, \tag{15}$$

where σ̂² is a random variable that is an estimator of σ².
For small σ², though, the Wigner distribution approaches
a more familiar Gaussian distribution, as discussed in
Sec. III. For a Gaussian distribution, the sampling errors
from a sample of size N_samp for σ̂² are given by

$$\mathrm{var}(\hat{\sigma}^{2}) = \frac{2\sigma^{4}}{N_{\mathrm{samp}} - 1}. \tag{16}$$

Accordingly, the standard deviation of the estimated values
of g can be seen to increase with increasing values of
g:

$$\sqrt{\mathrm{var}(\hat{g})} \approx \sqrt{\frac{2}{N_{\mathrm{samp}} - 1}}\,\left(g + \frac{3}{4}\right). \tag{17}$$


In this section we explore the effects of statistical fluc- 
tuations on the estimated value of g by performing some 
well-defined numerical experiments. The results are thus 
applicable to "ideal" data. In fact there apparently are 
systematic effects, noted earlier, in real data that limit 
the applicability of some deductions. 

Specifically, we begin with the following simple procedure:
First, we independently choose N_samp values of s
using the same known DGWD as the probability density
function for each selection. Second, we fit this artificial
TWD using the two-parameter Wigner distribution
P_{g,α}(s) to determine g, taking each point to be weighted
equally in accordance with standard practice. Third,
we repeat this process a large number of times and measure
the standard deviation of the fitted values of g as
well as any systematic bias in the fitted estimates.

Fig. 6 shows the result of this procedure, with one
million independent TWDs produced for each value of
g and each TWD consisting of 801 independent values
of s drawn according to a DGWD. Clearly the linear relationship
between √var(ĝ) and g is maintained, but the
slope is somewhat larger than predicted by Eq. (17).

Another way of estimating $g$ is to measure directly the mean and variance of the TWD and to insert them into Eq. (^). Repeating our procedure of creating artificial TWDs, we accordingly estimate $g$ using Eq. (^), again analyzing the variance of the estimates as above. As seen in Fig. 6, the resulting estimates of $g$ have variances given almost exactly by Eq. (17) and noticeably smaller (though not by a large factor) than the variances given by the traditional, uniformly weighted nonlinear least-squares fits. This finding means not only that it is possible to use simple analytic functions to find $g$ and $A$ instead of using two-parameter nonlinear least-squares fits, but also that doing so is statistically better.
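The moment-based route can be illustrated with a short Python sketch. It makes two assumptions that go beyond the text: widths are drawn from the continuous Wigner distribution instead of a DGWD, and $g$ is recovered by numerically inverting the exact variance relation $\sigma^2(g) = (g+1)/(2b_g) - 1$ by bisection rather than through a series expansion. All function names and parameter values are illustrative.

```python
import math
import random

def b_coeff(g):
    """b_g = [Gamma((g+2)/2) / Gamma((g+1)/2)]^2 for the unit-mean Wigner form."""
    return math.exp(2.0 * (math.lgamma((g + 2) / 2) - math.lgamma((g + 1) / 2)))

def wigner_variance(g):
    return (g + 1) / (2.0 * b_coeff(g)) - 1.0

def g_from_variance(var, lo=0.05, hi=300.0):
    """Invert the monotonically decreasing sigma^2(g) by bisection."""
    for _ in range(80):
        mid = 0.5 * (lo + hi)
        if wigner_variance(mid) > var:
            lo = mid  # variance still too large, so g must be larger
        else:
            hi = mid
    return 0.5 * (lo + hi)

def estimate_g(widths):
    """Estimate g from the directly measured mean and variance of the widths."""
    n = len(widths)
    mean = sum(widths) / n
    var = sum((s - mean) ** 2 for s in widths) / (n - 1)
    return g_from_variance(var / mean**2)  # rescale to unit mean first

def run_trials(g_true, n_samp, n_trials, seed=2):
    """Repeat the artificial experiment; return mean and std of the estimates."""
    rng = random.Random(seed)
    shape, scale = (g_true + 1) / 2, 1.0 / b_coeff(g_true)
    estimates = []
    for _ in range(n_trials):
        widths = [math.sqrt(rng.gammavariate(shape, scale)) for _ in range(n_samp)]
        estimates.append(estimate_g(widths))
    mean_g = sum(estimates) / n_trials
    std_g = math.sqrt(sum((e - mean_g) ** 2 for e in estimates) / (n_trials - 1))
    return mean_g, std_g

mean_g, std_g = run_trials(g_true=4.0, n_samp=801, n_trials=300)
print(f"mean of estimates {mean_g:.3f}, standard deviation {std_g:.3f}")
```

No fitting routine or choice of weights enters this estimate, which is the practical appeal of the moment-based approach.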

This result appears to be contrary to the belief that performing a least-squares fit to an appropriate smooth function is desirable to minimize the effects of statistical fluctuations. It seems likely, though, that the real problem lies in the weighting of the data in the fit. It has been suggested that greater weight should be given to the points near the peak of the TWD, so we once again repeat our procedure, this time making a least-squares fit in which each point is weighted proportionally to the measured value of $P$. As Fig. 6 shows, the standard deviation of $\hat{g}$ again varies linearly with $g$ but with a slope that is slightly higher than that of the uniformly weighted case. In retrospect, this result should not be surprising, since each point on the TWD represents the result of $N_{\rm samp}$ binomial experiments (i.e., Bernoulli trials: either the measurement of step separation gives this distance $s_L$ or some other distance). Elementary statistics p^ ] shows that the statistical error of binomial experiments is smallest when the probability of success is nearly zero or nearly one; in our case, this holds for points on the TWD with $P(s) \approx 0$. However, devising a naive weighting by the reciprocal of the variance of each point on the TWD is problematic for points at which the measured value of $P(s)$ is equal to zero; these points would receive infinite weight, yielding nonsense results. Even if one circumvents this problem, there is still the problem that the points are not uncorrelated, which is a requirement of the least-squares procedure ]l^ ; the normalization condition imposes a (weak) correlation between points. We can avoid these problems by simply using the mean and variance of the TWD in Eq. (|) to find $g$ or Eq. ^ to find $A$.
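The binomial argument can be made concrete with a small Python sketch. It assumes that the count in a single histogram bin follows a binomial distribution with success probability $p$ equal to the bin probability and $N_{\rm samp}$ trials, so that the observed frequency has variance $p(1-p)/N_{\rm samp}$. The helper names and the numbers are illustrative only.

```python
import math

def frequency_variance(p, n_samp):
    """Variance of the observed bin frequency k/n_samp for a binomial count k."""
    return p * (1.0 - p) / n_samp

def inverse_variance_weight(p_measured, n_samp):
    """Naive least-squares weight 1/variance; it diverges for an empty bin."""
    var = frequency_variance(p_measured, n_samp)
    return math.inf if var == 0.0 else 1.0 / var

n_samp = 400
tail = frequency_variance(0.01, n_samp)   # bin far out in the tail of the TWD
peak = frequency_variance(0.20, n_samp)   # bin near the peak of the TWD
empty_weight = inverse_variance_weight(0.0, n_samp)
print(tail, peak, empty_weight)
```

The tail bin has the smaller absolute error, and a bin whose measured frequency is exactly zero receives an infinite naive weight, which is the pathology described above.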

Motivated by these numerical experiments, we computed directly (from the histogram data, after normalization and adjustment to unit mean) the standard deviation $\sigma_{\rm dir}$ (cf. Table I). This result is systematically higher than the $\sigma$'s obtained from the various fitting techniques. The difference is modest for good data but pronounced for poor data. Thus, as mentioned in Sec. 0, it appears to come from the curious high-$s$ undulations that plague poor data. The fitting techniques, being less sensitive to these points, give values of $\sigma$ that are less distorted by them.

Finally, we note that single STM images do not al- 
low for a large number of independent values of s. The 
number of independent measurements is generally much 
smaller than the total number of measurements, due to 
correlations between the measurements. Although a pre- 
cise determination of the effects of correlations on fitted 
parameters would be rather involved, a working estimate 
of the number of "independent" measurements — from 
which uncertainties can be estimated — can be made in 
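the following way. First, one obtains the terrace width $\ell_n(y)$ between steps $n$ and $n+1$ for each position $y$ along the steps. Then the correlation function |1J|

$$C_n(y) = \frac{\langle \ell_0(0)\,\ell_n(y)\rangle - \langle \ell\rangle^2}{\langle \ell^2\rangle - \langle \ell\rangle^2} = \frac{1}{\langle \ell^2\rangle - \langle \ell\rangle^2}\left[\frac{\sum_{y'=1}^{L_y-y}\sum_{n'=1}^{N-n} \ell_{n'}(y')\,\ell_{n'+n}(y'+y)}{(N-n)(L_y-y)} - \langle \ell\rangle^2\right] \qquad (18)$$
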



is calculated, where $N$ is the number of terraces in the image (i.e., $N+1$ is the number of steps). The correlation function along the steps decays exponentially as $C_0(y) \sim \exp(-y/\xi_y)$, where $\xi_y$ is the intrastep correlation length (see footnote). $\xi_y$ is given by Eq. (18) of Ref. ||l|, but the safest procedure is simply to measure it. The correlation function between steps, on the other hand, is more



As discussed in Ref. ^], from the Gruber-Mullins ^]



perspective — ^^^^ ^ for A — and £,y 



for A->0. 
complicated; $C_1(0)$ is negative but the trend is for the absolute value of $C_n(0)$ to decrease rapidly with increasing $n$. We define $y_c$ to be the smallest value of $y$ for which

$$|C_0(y)| < c \quad \forall\, y > y_c, \qquad (19)$$

and likewise $n_c$ to be the smallest value of $n$ for which

$$|C_n(0)| < c \quad \forall\, n > n_c, \qquad (20)$$

where $c$ is a small cutoff (we recommend $c = 0.1$). The number of "independent" terrace widths will be approximately $(L_y/y_c)(N/n_c)$.
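The cutoff rule and the resulting count of "independent" widths can be sketched in Python. The correlation values below are synthetic stand-ins (an exponential decay with correlation length 15 along the steps and a short alternating sequence between steps); they, the function names, and the image size are assumptions made only for illustration.

```python
import math

def cutoff_index(values, c=0.1):
    """Smallest index i such that |values[j]| < c for every j > i."""
    last_large = 0
    for j, v in enumerate(values):
        if abs(v) >= c:
            last_large = j  # everything beyond the last large entry is below c
    return last_large

def independent_widths(c_along, c_between, l_y, n_terraces, c=0.1):
    """Return y_c, n_c and the estimate (L_y / y_c) * (N / n_c)."""
    y_c = max(cutoff_index(c_along, c), 1)    # guard against division by zero
    n_c = max(cutoff_index(c_between, c), 1)
    return y_c, n_c, (l_y / y_c) * (n_terraces / n_c)

# Synthetic correlation functions, not measured data.
c_along = [math.exp(-y / 15.0) for y in range(200)]   # stands in for C_0(y)
c_between = [1.0, -0.30, 0.12, -0.05, 0.02, -0.01]    # stands in for C_n(0)

y_c, n_c, n_indep = independent_widths(c_along, c_between, l_y=200, n_terraces=20)
print(y_c, n_c, round(n_indep, 1))
```

With measured correlation functions, the same two calls to `cutoff_index` would replace a visual inspection of where the correlations die out.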

As an example, we performed a Monte Carlo simulation of a system with $A = 0$. For simplicity, we chose $k_BT$ to be equal to the energy for producing a kink, and we chose the mean distance between steps to be ten lattice units. We measured $\xi_y \approx 15$ (consistent with theoretical predictions, see preceding footnote) and $y_c \approx 40$, and we observed that $|C_n(0)| < 0.1$ for all $n \ge 3$, so that $n_c = 2$. Suppose this had been an STM image representing a square region of the crystal 200 lattice units on a side; then there would be approximately 20 terraces in the image, and $(200/40)(20/2) = 50$ independent widths, much smaller than the $200 \times 20 = 4000$ independent measurements that one might naively suspect. As a result, we see that the uncertainty in statistically derived quantities such as the measured value of $A$ is an order of magnitude larger than the naive estimate. Lowering the temperature relative to the kink creation energy would have the effect of further reducing the number of independent measurements and thus increasing the uncertainty in the measured value of $A$.
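The numbers in this example can be spelled out as follows, assuming (as a rule of thumb that is not stated explicitly above) that statistical uncertainties scale as the inverse square root of the number of independent measurements. The symbols $N_{\rm indep}$, $N_{\rm naive}$, and $\delta$ are introduced here only for this illustration.

```latex
N_{\rm indep} \approx \frac{L_y}{y_c}\,\frac{N}{n_c}
             = \frac{200}{40}\cdot\frac{20}{2} = 50,
\qquad
N_{\rm naive} = 200 \times 20 = 4000,
\qquad
\frac{\delta_{\rm indep}}{\delta_{\rm naive}}
  \approx \sqrt{\frac{N_{\rm naive}}{N_{\rm indep}}} = \sqrt{80} \approx 8.9 .
```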

With such small samples, the measured TWD can differ distinctly from the DGWD due to statistical fluctuations alone. In order to demonstrate this idea, we produced 20 TWDs, each consisting of 400 independent values of $s = s_L$ sampled from a DGWD with $g = 5$ and $\langle L\rangle = 6$, and fitted each TWD with a DGWD. (In this model, $s_L = L/\langle L\rangle$; there is no offset between successive terraces.) Fig. 7 shows the TWDs with the lowest and the highest values of $\chi^2$. Curiously, in this particular case the TWD with the largest value of $\chi^2$ happens to produce better estimates of both $g$ and $\langle L\rangle$ than does the TWD with the smallest value of $\chi^2$. In no case, however, do we see the shoulders or second peaks in the TWD at large values of $s$ as occur systematically in the "poor" data of Fig. 2 here or Fig. 5b in Ref. ^ . Since the "poor" data were based on several dozen independent STM measurements, they should be statistically comparable to the data of Fig. 7, but the systematic deviation indicates that the "poor" data cannot be entirely understood within the framework of a generalized Wigner distribution.



VIII. CONCLUSIONS 

In this paper we have performed several numerical ex- 
periments and analyses to understand better the TWDs 
derived from physical data from vicinal surfaces. We have 
quantitatively studied how the Wigner distribution ap- 
proaches a Gaussian for large dimensionless interactions, 
and shown that for most systems of physical interest the 
standard deviation of the terrace width can be estimated 
from either distribution with little difference. 

The mean step width can be estimated in a variety of 
ways for both Wigner and Gaussian distributions; two 
reasonable but inequivalent choices are directly averag- 
ing the step width and fitting the TWD to the desired 
distribution using a nonlinear least-squares routine. In 
Sec. IV we discuss the effects of using an estimated mean 
that differs from the least-squares estimate. On the other 
hand, the adjustment of the normalization of the curve 
to obtain a "better" fit is unjustified; even if the visual 
agreement appears to improve, no results from theory 
— such as interaction strengths — can be meaningfully 
extracted from TWDs with the wrong normalization. 

We have proposed a two-parameter extension of the 
generalized Wigner surmise, which really is just a consis- 
tent fitting of both g and the mean terrace width within a 
single two-parameter least-squares fit. This added flexi- 
bility allows one to deal more fruitfully with poorer-than- 
desirable experimental data, while not changing good 
data (or the data emerging from various numerical simulations). Thus, this fitting function can be applied generally.
On the other hand, for "good" data, we have shown that 
a simple series expansion based on the directly measured 
mean and standard deviation of the terrace widths has 
better statistical properties. The least-squares fit is more 
robust, but this is a property that is useful only when the 
Wigner distribution does not capture all of the important 
physics, such as the role of defects or more complicated 
step-step interactions; in such cases, the estimated val- 
ues of the step-step interaction from the Wigner — or 
the Gaussian — distribution must be treated with great 
caution. 

Finally, we emphasize the importance of using many STM measurements to ensure good statistics and the desirability of calculating the correlations between terrace widths within individual STM images. As we saw in Section VII, the typical STM image suitable for measuring terrace widths will contain no more than about 50 independent terrace width measurements, almost two orders of magnitude fewer than the 4000 total terrace width measurements.
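APPENDIX A: DERIVATION OF EXPANSIONS

In this appendix we derive the useful series expansions, Eqs. (Q) and (H), for the coefficient $b_g$ in the quadratic exponential of the generalized Wigner distribution. For convenience, we define the variable

$$r \equiv \frac{g+1}{2}. \qquad (A1)$$

Then we use Stirling's asymptotic series $\Gamma(r) = (2\pi)^{1/2}\exp(-r)\,r^{r-1/2}F(r)$:

$$b_g = \left[\frac{\Gamma(r+1/2)}{\Gamma(r)}\right]^2 = \left[\frac{e^{-(r+1/2)}\,(r+1/2)^{r}\,F(r+1/2)}{e^{-r}\,r^{r-1/2}\,F(r)}\right]^2, \qquad (A2)$$

where

$$F(r) = 1 + \frac{1}{12r} + \frac{1}{288r^2} - \frac{139}{51840r^3} - \frac{571}{2488320r^4} + O(r^{-5}). \qquad (A3)$$

We concentrate initially on the first part of the fraction:

$$\left[\frac{e^{-(r+1/2)}\,(r+1/2)^{r}}{e^{-r}\,r^{r-1/2}}\right]^2 = e^{-1}\,r\left(1+\frac{1}{2r}\right)^{2r}. \qquad (A4)$$

But

$$\left(1+\frac{1}{2r}\right)^{2r} = \exp\left[2r\ln\left(1+\frac{1}{2r}\right)\right] = \exp\left[1 - \tfrac{1}{2}(2r)^{-1} + \tfrac{1}{3}(2r)^{-2} - \tfrac{1}{4}(2r)^{-3} + \tfrac{1}{5}(2r)^{-4} + O[(2r)^{-5}]\right]$$
$$= e\left[1 - \tfrac{1}{2}(2r)^{-1} + \tfrac{11}{24}(2r)^{-2} - \tfrac{7}{16}(2r)^{-3} + \tfrac{2477}{5760}(2r)^{-4} + O[(2r)^{-5}]\right]. \qquad (A5)$$

It is also straightforward to show that

$$\left[\frac{F(r+1/2)}{F(r)}\right]^2 = 1 - \tfrac{1}{3}(2r)^{-2} + \tfrac{1}{3}(2r)^{-3} - \tfrac{13}{90}(2r)^{-4} + O[(2r)^{-5}]. \qquad (A6)$$

By combining all of these we find

$$b_g = r - \frac{1}{4} + \frac{1}{32}r^{-1} + \frac{1}{128}r^{-2} - \frac{13}{6144}r^{-3} + O(r^{-4}). \qquad (A7)$$

From this formula for $b_g$, we can use Eq. (^aj) to write $\sigma^2$ as a power series in $g^{-1}$:

$$\sigma^2 = \frac{1}{2}g^{-1} - \frac{3}{8}g^{-2} + \frac{3}{16}g^{-3} + \frac{7}{384}g^{-4} + O(g^{-5}). \qquad (A8)$$

Using reversion of series, we then find

$$g^{-1} = 2\sigma^2 + 3(\sigma^2)^2 + 6(\sigma^2)^3 + \frac{32}{3}(\sigma^2)^4 + O[(\sigma^2)^5], \qquad (A9)$$

from which we get

$$g = \frac{1}{2}(\sigma^2)^{-1} - \frac{3}{4} - \frac{3}{8}\sigma^2 + \frac{7}{48}(\sigma^2)^2 + O[(\sigma^2)^3]. \qquad (A10)$$

Finally, with Eq. (|) we can use these results to find the dimensionless interaction constant $A_W$ in terms of $\sigma^2$:

$$A_W = \frac{1}{16}\left[(\sigma^2)^{-2} - 7(\sigma^2)^{-1} + \frac{27}{4} + \frac{35}{6}\sigma^2\right] + O[(\sigma^2)^2]. \qquad (A11)$$

For $A > 0.0525$, the relative error in Eq. (A11) is less than 1% (less than 0.1% for $A > 0.15$). The absolute error is less than 1% for $A > -1/4$.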
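The accuracy statements for Eq. (A 11) can be spot-checked numerically. The Python sketch below assumes the standard unit-mean Wigner relations $b_g = [\Gamma((g+2)/2)/\Gamma((g+1)/2)]^2$, $\sigma^2 = (g+1)/(2b_g) - 1$, and $A = g(g-2)/4$, and takes the four-term expansion to be $A_W \approx \frac{1}{16}[(\sigma^2)^{-2} - 7(\sigma^2)^{-1} + 27/4 + (35/6)\sigma^2]$. The function names and the sampled values of $g$ are illustrative choices.

```python
import math

def b_coeff(g):
    """b_g = [Gamma((g+2)/2) / Gamma((g+1)/2)]^2 (unit-mean Wigner form)."""
    return math.exp(2.0 * (math.lgamma((g + 2) / 2) - math.lgamma((g + 1) / 2)))

def wigner_variance(g):
    """Exact variance sigma^2 = (g + 1) / (2 b_g) - 1."""
    return (g + 1) / (2.0 * b_coeff(g)) - 1.0

def interaction_exact(g):
    """Dimensionless interaction A = g (g - 2) / 4 (assumed standard relation)."""
    return g * (g - 2.0) / 4.0

def interaction_series(var):
    """Four-term expansion of A in powers of the variance, as assumed above."""
    return (var**-2 - 7.0 / var + 27.0 / 4.0 + 35.0 * var / 6.0) / 16.0

results = {}
for g in (2.0, 3.0, 4.0, 6.0):
    exact = interaction_exact(g)
    approx = interaction_series(wigner_variance(g))
    results[g] = (exact, approx)
    print(f"g = {g}: exact A = {exact:.4f}, series A = {approx:.4f}")
```

For $g = 2$ the exact interaction vanishes, so only the absolute error is meaningful there; for larger $g$ the relative error is the natural measure.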
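APPENDIX B: EFFECT OF DISPLACEMENT OF A GAUSSIAN FITTING FUNCTION IN FITS OF A GAUSSIAN

Suppose we have a Gaussian distribution with mean $\mu$ and variance $\sigma^2$; we attempt to fit this Gaussian with a second Gaussian with a mean $\mu + \Delta\mu$ and a variance $(1-\zeta)^2\sigma^2$, where $\Delta\mu$ is fixed and $\zeta$ is unknown. Explicitly, we write $\chi^2$ as a function of $\zeta$ and $\Delta\mu$:

$$\chi^2 = \int dx \left\{\frac{1}{\sigma\sqrt{2\pi}}\exp\left[-\frac{(x-\mu)^2}{2\sigma^2}\right] - \frac{1}{\sigma(1-\zeta)\sqrt{2\pi}}\exp\left[-\frac{(x-\mu-\Delta\mu)^2}{2\sigma^2(1-\zeta)^2}\right]\right\}^2$$
$$= \frac{1}{2\sigma\sqrt{\pi}} + \frac{1}{2\sigma(1-\zeta)\sqrt{\pi}} - \frac{2}{\sigma\sqrt{2\pi}\,[1+(1-\zeta)^2]^{1/2}}\exp\left[-\frac{(\Delta\mu)^2}{2\sigma^2\{1+(1-\zeta)^2\}}\right]$$
$$= \frac{1}{\sigma\sqrt{\pi}}\left\{\left(\frac{\Delta\mu}{2\sigma}\right)^2 + \frac{3}{2}\left(\frac{\Delta\mu}{2\sigma}\right)^2\zeta + \frac{3}{8}\zeta^2 + \cdots\right\}. \qquad (B1)$$

Since the optimum fit is found by minimizing $\chi^2$, we set $d\chi^2/d\zeta = 0$ in Eq. (B1) and remove the overall prefactor $1/(\sigma\sqrt{\pi})$ to get:

$$\frac{3}{2}\left(\frac{\Delta\mu}{2\sigma}\right)^2 + \frac{3}{4}\zeta + \cdots = 0. \qquad (B2)$$

Solving for $\zeta$, we find

$$\zeta = -2\left(\frac{\Delta\mu}{2\sigma}\right)^2 + \cdots. \qquad (B3)$$

Since $\Delta\sigma^2/\sigma^2 = 2\Delta\sigma/\sigma$, Eq. (B3) leads to Eq. (hj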
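The setup of this appendix can be checked numerically. The Python sketch below assumes the closed form obtained by carrying out the Gaussian integrals, $\chi^2\,\sigma\sqrt{\pi} = \tfrac12 + \tfrac{1}{2(1-\zeta)} - \sqrt{2/w}\,\exp(-2\delta^2/w)$ with $w = 1 + (1-\zeta)^2$ and $\delta = \Delta\mu/(2\sigma)$, and minimizes it over $\zeta$ with a golden-section search. It then compares the minimizer with the leading-order estimate $\zeta \approx -2\delta^2$. The function names, the search interval, and the value $\delta = 0.1$ are illustrative assumptions.

```python
import math

def chi2_scaled(zeta, delta):
    """chi^2 multiplied by sigma*sqrt(pi), with delta = (Delta mu) / (2 sigma)."""
    w = 1.0 + (1.0 - zeta) ** 2
    return 0.5 + 0.5 / (1.0 - zeta) - math.sqrt(2.0 / w) * math.exp(-2.0 * delta**2 / w)

def minimize_golden(f, lo, hi, tol=1e-8):
    """Golden-section search for the minimum of a unimodal function on [lo, hi]."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    while b - a > tol:
        if f(c) < f(d):
            b, d = d, c                  # minimum lies in [a, d]
            c = b - inv_phi * (b - a)
        else:
            a, c = c, d                  # minimum lies in [c, b]
            d = a + inv_phi * (b - a)
    return 0.5 * (a + b)

delta = 0.1  # illustrative displacement, in units of 2*sigma
zeta_opt = minimize_golden(lambda z: chi2_scaled(z, delta), -0.5, 0.5)
zeta_leading = -2.0 * delta**2
print(f"numerical zeta = {zeta_opt:.5f}, leading-order estimate = {zeta_leading:.5f}")
```

A negative $\zeta$ means the best-fit Gaussian is broader than the true one, so a displaced mean inflates the fitted width.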
FIG. 1. The vicinal surface Cu(1 1 13) at 300 K is an example of good data. The points show the normalized data from the STM image. The short-dashed curve shows the conventional (three-parameter) Gaussian fit to the data; the fitted standard deviation is $\sigma_G = 0.25 \pm 0.01$. The long-dashed curve shows a fit to a generalized Wigner distribution with the exponent $g$ as the single adjustable parameter. The best-fit result is $g = 6.4 \pm 0.5$, leading via Eq. to the estimate $\sigma_W = 0.26 \pm 0.01$. The terrace widths are scaled by the mean spacing determined from the average of the data. In the solid curve, the Wigner distribution is treated as a two-parameter function. We now find $g = 6.5 \pm 0.3$, leading again to $\sigma_W = 0.26 \pm 0.01$.



FIG. 6. Standard deviations in fitted values of $g$ due to statistical fluctuations. In each case, fits were made to terrace width distributions consisting of $N_{\rm samp} = 801$ values of $s$ independently distributed according to a DGWD function. Each circle represents a sampling over one million uniformly weighted two-parameter fits. Each square represents a sampling over ten thousand two-parameter fits, in which each point in the TWD was weighted proportionally to $P(s)$. Each diamond represents a sampling over ten thousand applications of Eq. (p|). The line is the prediction of Eq. (17). Both in terms of computational difficulty and statistical quality, Eq. is clearly superior to nonlinear least-squares fits.



FIG. 2. The vicinal surface Cu(1 1 13) at the higher temperature 378 K is an example of poor data. As in Fig. 1, the points show the data, the short-dashed curve a conventional Gaussian fit, the long-dashed curve a single-parameter Wigner fit, and the solid curve a two-parameter Wigner fit. For the Gaussian fit, we get $\sigma_G = 0.30 \pm 0.04$, a broader distribution than in Fig. 1, as expected for the higher $T$. In contrast to Fig. 1, there is a considerable difference between the two Wigner fits, with the two-parameter version providing a much better accounting due to its ability to adjust to accommodate the points near the peak. For the one-parameter fit, we find $g = 2.5 \pm 0.7$, leading to $\sigma_W = 0.39 \pm 0.03$, while for the two-parameter fit, we get $g = 3.0 \pm 0.5$, leading to $\sigma_W = 0.36 \pm 0.03$. The small undulations in the data on the high-$s$ side of the peak, in this example near $s = 1.5$ and again for larger $s$, are characteristic of poor data.



FIG. 7. Twenty TWDs were simulated by drawing $N_{\rm samp} = 400$ values of $s$ according to a DGWD with $g = 5$ and $\langle L\rangle = 6$, indicated by the solid black curve. Each TWD was then fitted to a single-parameter CGWD, as in Eqs. (1)-(3). Shown are the TWDs with the smallest (•) and largest (□) values of $\chi^2$. The fits to these are shown as the black dashed curve and the x's, respectively.



FIG. 3. Analysis of terrace width distributions of Pt(110) using Wigner distributions. The experimental points of Ref. 1^ are indicated by dots. As in Figs. 1 and 2, the short-dashed curve is a conventional Gaussian fit, the long-dashed curve a single-parameter Wigner fit, and the solid curve a two-parameter Wigner fit.



FIG. 4. The error in estimates $A_c$ of $A$ derived by using formulae for the mean and variance of the continuous generalized Wigner surmise TWD on discrete TWDs, for the physical range of $A$. $\langle L\rangle$ indicates the mean terrace width in units of $a_\perp$. For $\langle L\rangle = 2$ and 3, the ordinate values have been divided by 1000 and by 50, respectively, to appear on the same vertical scale; evidently, discreteness for these narrow terraces introduces unacceptably large errors, particularly as $A$ increases. The smooth curves through these points, to guide the eye, are dashed to distinguish them from the cases with broader terraces.



FIG. 5. The error in estimates $A_c$ of $A$ derived by using formulae for the mean and variance of the continuous generalized Wigner surmise TWD on discrete TWDs. The estimates evidently improve considerably with increasing $\langle L\rangle$ (broader terraces, with higher Miller indices).
TABLE I: Tabulation of the results of fitting data from various vicinal surfaces of copper to a Gaussian with three parameters (labeled by subscript G) and to Wigner distributions with one or with two adjustable parameters (labeled by subscripts 1 and 2, respectively). The temperature in kelvin is given in the first column and the qualitative characterization (+ for good, for fair, - for poor) in the second. The final column, labeled $\Delta\mu$, indicates how much the mean (or first moment) computed directly from the data exceeds the optimal mean obtained via the second parameter in the two-parameter Wigner fit; using the notation of Eq. (13), we have $\Delta\mu = \alpha^{-1} - 1 \approx 1 - \alpha$. Motivated by the discussion of Sec. VII, we include in the final column the standard deviation $\sigma_{\rm dir}$ evaluated directly from the normalized (and adjusted to unit mean) histogram data.